 sim2real transfer


Sim2real transfer learning for 3D human pose estimation: motion to the rescue

Neural Information Processing Systems

Synthetic visual data can provide practically infinite diversity and rich labels, while avoiding ethical issues with privacy and bias. However, for many tasks, current models trained on synthetic data generalize poorly to real data. The task of 3D human pose estimation is a particularly interesting example of this sim2real problem, because learning-based approaches perform reasonably well given real training data, yet labeled 3D poses are extremely difficult to obtain in the wild, limiting scalability. In this paper, we show that standard neural-network approaches, which perform poorly when trained on synthetic RGB images, can perform well when the data is pre-processed to extract cues about the person's motion, notably as optical flow and the motion of 2D keypoints. Therefore, our results suggest that motion can be a simple way to bridge a sim2real gap when video is available. We evaluate on the 3D Poses in the Wild dataset, the most challenging modern benchmark for 3D pose estimation, where we show full 3D mesh recovery that is on par with state-of-the-art methods trained on real 3D sequences, despite training only on synthetic humans from the SURREAL dataset.
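As a rough illustration of the "motion of 2D keypoints" cue mentioned in the abstract, the sketch below computes frame-to-frame keypoint displacements with NumPy. The array layout `(T, J, 2)`, the function name `keypoint_motion`, and the toy data are assumptions made for this example; the abstract does not specify how the authors encode keypoint motion, so this should not be read as their pipeline.

```python
import numpy as np


def keypoint_motion(keypoints: np.ndarray) -> np.ndarray:
    """Return frame-to-frame displacements of 2D keypoints.

    keypoints: array of shape (T, J, 2) holding (x, y) pixel coordinates
               for J joints over T frames (layout assumed for this sketch).
    Returns an array of shape (T - 1, J, 2).
    """
    keypoints = np.asarray(keypoints, dtype=float)
    if keypoints.ndim != 3 or keypoints.shape[-1] != 2:
        raise ValueError("expected an array of shape (T, J, 2)")
    # Difference between consecutive frames along the time axis.
    return np.diff(keypoints, axis=0)


# Toy track (hypothetical data): 3 frames, 2 joints.
track = np.array([
    [[0.0, 0.0], [10.0, 5.0]],
    [[1.0, 0.0], [10.0, 7.0]],
    [[3.0, 1.0], [9.0, 7.0]],
])
motion = keypoint_motion(track)
```

Displacements like these carry no texture or lighting information, which is one plausible reason a motion representation could differ less between synthetic and real video than raw RGB does.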


Reviews: Sim2real transfer learning for 3D human pose estimation: motion to the rescue

Neural Information Processing Systems

Positives: The paper is well written and includes a thorough literature review. The following paper is also very relevant to the submission: Shrivastava, Ashish, et al. "Learning from simulated and unsupervised images through adversarial training." The novelty of the method over [44] is not major. Still, I believe no one has shown that computing flow on simulated data and using it for training improves over RGB only (although the improvement is quite marginal). The simulation pipeline proposed in the paper seems to be quite useful.


Reviews: Sim2real transfer learning for 3D human pose estimation: motion to the rescue

Neural Information Processing Systems

After reviewer discussion and rebuttal, this paper received three acceptance recommendations. R1 and R2 are more positive about the paper and acknowledge the contribution. R3 points out that the impact of using just flow, without person and camera motion, is limited. Please consider incorporating the post-rebuttal portion of the reviews in the final revision. The method, approach and quality of the paper are high, as acknowledged by all reviewers.


SplatSim: Zero-Shot Sim2Real Transfer of RGB Manipulation Policies Using Gaussian Splatting

Qureshi, Mohammad Nomaan, Garg, Sparsh, Yandun, Francisco, Held, David, Kantor, George, Silwal, Abhishesh

arXiv.org Artificial Intelligence

Sim2Real transfer, particularly for manipulation policies relying on RGB images, remains a critical challenge in robotics due to the significant domain shift between synthetic and real-world visual data. In this paper, we propose SplatSim, a novel framework that leverages Gaussian Splatting as the primary rendering primitive to reduce the Sim2Real gap for RGB-based manipulation policies. By replacing traditional mesh representations with Gaussian Splats in simulators, SplatSim produces highly photorealistic synthetic data while maintaining the scalability and cost-efficiency of simulation. We demonstrate the effectiveness of our framework by training manipulation policies within SplatSim and deploying them in the real world in a zero-shot manner, achieving an average success rate of 86.25%, compared to 97.5% for policies trained on real-world data.


Scaling Law of Sim2Real Transfer Learning in Expanding Computational Materials Databases for Real-World Predictions

Minami, Shunya, Hayashi, Yoshihiro, Wu, Stephen, Fukumizu, Kenji, Sugisawa, Hiroki, Ishii, Masashi, Kuwajima, Isao, Shiratori, Kazuya, Yoshida, Ryo

arXiv.org Artificial Intelligence

To address the challenge of limited experimental materials data, extensive physical property databases are being developed based on high-throughput computational experiments, such as molecular dynamics simulations. Previous studies have shown that fine-tuning a predictor pretrained on a computational database to a real system can result in models with outstanding generalization capabilities compared to learning from scratch. This study demonstrates the scaling law of simulation-to-real (Sim2Real) transfer learning for several machine learning tasks in materials science. Case studies of three prediction tasks for polymers and inorganic materials reveal that the prediction error on real systems decreases according to a power-law as the size of the computational data increases. Observing the scaling behavior offers various insights for database development, such as determining the sample size necessary to achieve a desired performance, identifying equivalent sample sizes for physical and computational experiments, and guiding the design of data production protocols for downstream real-world tasks.
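To make the power-law claim concrete, here is a minimal sketch that fits `err ≈ a * n**(-b)` in log-log space and inverts it to estimate the computational sample size needed for a target error. The pure power-law form without an offset term, the function names, and the synthetic numbers are all assumptions for illustration; the abstract does not state the exact functional form or report these values.

```python
import numpy as np


def fit_power_law(n, err):
    """Fit err ≈ a * n**(-b) by linear least squares on log-transformed data.

    Returns (a, b). Assumes a pure power law with no irreducible-error offset.
    """
    n = np.asarray(n, dtype=float)
    err = np.asarray(err, dtype=float)
    # log(err) = log(a) - b * log(n) is a straight line in log-log space.
    slope, intercept = np.polyfit(np.log(n), np.log(err), 1)
    return float(np.exp(intercept)), float(-slope)


def samples_for_target(a, b, target_err):
    """Invert err = a * n**(-b) to get the sample size n for a target error."""
    return (a / target_err) ** (1.0 / b)


# Synthetic, noise-free data (hypothetical): err = 2 * n**(-0.5).
sizes = np.array([1e2, 1e3, 1e4, 1e5])
errors = 2.0 * sizes ** -0.5

a, b = fit_power_law(sizes, errors)
n_needed = samples_for_target(a, b, target_err=0.01)
```

With real measurements the fit would be noisy, and a model with an added constant floor may describe the data better; the inversion step is what corresponds to "determining the sample size necessary to achieve a desired performance".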


Sim2Real Manipulation on Unknown Objects with Tactile-based Reinforcement Learning

Su, Entong, Jia, Chengzhe, Qin, Yuzhe, Zhou, Wenxuan, Macaluso, Annabella, Huang, Binghao, Wang, Xiaolong

arXiv.org Artificial Intelligence

Using tactile sensors for manipulation remains one of the most challenging problems in robotics. At the heart of these challenges is generalization: How can we train a tactile-based policy that can manipulate unseen and diverse objects? In this paper, we propose to perform Reinforcement Learning with only visual tactile sensing inputs on diverse objects in a physical simulator. Training with diverse objects in simulation enables the policy to generalize to unseen objects. However, leveraging simulation introduces the Sim2Real transfer problem. To mitigate this problem, we study different tactile representations and evaluate how each affects real-robot manipulation results after transfer. We conduct our experiments on diverse real-world objects and show significant improvements over baselines for the pivoting task. Our project page is available at https://tactilerl.github.io/.


Learning to Fly in Seconds

Eschmann, Jonas, Albani, Dario, Loianno, Giuseppe

arXiv.org Artificial Intelligence

Learning-based methods, particularly Reinforcement Learning (RL), hold great promise for streamlining deployment, enhancing performance, and achieving generalization in the control of autonomous multirotor aerial vehicles. Deep RL has been able to control complex systems with impressive fidelity and agility in simulation but the simulation-to-reality transfer often brings a hard-to-bridge reality gap. Moreover, RL is commonly plagued by prohibitively long training times. In this work, we propose a novel asymmetric actor-critic-based architecture coupled with a highly reliable RL-based training paradigm for end-to-end quadrotor control. We show how curriculum learning and a highly optimized simulator enhance sample complexity and lead to fast training times. To precisely discuss the challenges related to low-level/end-to-end multirotor control, we also introduce a taxonomy that classifies the existing levels of control abstractions as well as non-linearities and domain parameters. Our framework enables Simulation-to-Reality (Sim2Real) transfer for direct RPM control after only 18 seconds of training on a consumer-grade laptop as well as its deployment on microcontrollers to control a multirotor under real-time guarantees. Finally, our solution exhibits competitive performance in trajectory tracking, as demonstrated through various experimental comparisons with existing state-of-the-art control solutions using a real Crazyflie nano quadrotor. We open source the code including a very fast multirotor dynamics simulator that can simulate about 5 months of flight per second on a laptop GPU. The fast training times and deployment to a cheap, off-the-shelf quadrotor lower the barriers to entry and help democratize the research and development of these systems.
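For a sense of scale, the quoted simulator throughput ("about 5 months of flight per second") can be converted into a real-time speed-up factor. The 30-day month and the helper name `realtime_factor` are assumptions for this back-of-the-envelope sketch; the abstract gives only the approximate figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a 30-day month, used only for a rough estimate


def realtime_factor(months_per_second: float,
                    days_per_month: int = DAYS_PER_MONTH) -> float:
    """Simulated seconds produced per wall-clock second."""
    return months_per_second * days_per_month * SECONDS_PER_DAY


# 5 simulated months per wall-clock second, as quoted in the abstract.
factor = realtime_factor(5)
```

Under these assumptions the simulator runs on the order of ten million times faster than real time.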


World Model Based Sim2Real Transfer for Visual Navigation

Liu, Chen, Lekkala, Kiran, Itti, Laurent

arXiv.org Artificial Intelligence

Sim2Real transfer has gained popularity because it helps transfer from inexpensive simulators to the real world. This paper presents a novel system that fuses components of a traditional World Model into a robust system, trained entirely within a simulator, that transfers zero-shot to the real world. To facilitate transfer, we use an intermediary representation that is based on Bird's Eye View (BEV) images. Thus, our robot learns to navigate in a simulator by first learning to translate from complex First-Person View (FPV) based RGB images to BEV representations, then learning to navigate using those representations. Later, when tested in the real world, the robot uses the perception model that translates FPV-based RGB images to embeddings that are used by the downstream policy. The incorporation of state-checking modules using Anchor images and a Mixture Density LSTM not only interpolates uncertain and missing observations but also enhances the robustness of the model when exposed to the real-world environment. We trained the model using data collected with a differential drive robot in the CARLA simulator. Our methodology's effectiveness is shown through the deployment of trained models onto a real-world differential drive robot. Lastly, we release a comprehensive codebase, dataset and models for training and deployment that are available to the public.


Towards Sim2Real Transfer of Autonomy Algorithms using AutoDRIVE Ecosystem

Samak, Chinmay Vilas, Samak, Tanmay Vilas, Krovi, Venkat

arXiv.org Artificial Intelligence

The engineering community currently encounters significant challenges in the development of intelligent transportation algorithms that can be transferred from simulation to reality with minimal effort. This can be achieved by robustifying the algorithms using domain adaptation methods and/or by adopting cutting-edge tools that help support this objective seamlessly. This work presents AutoDRIVE, an openly accessible digital twin ecosystem designed to facilitate synergistic development, simulation and deployment of cyber-physical solutions pertaining to autonomous driving technology; and focuses on bridging the autonomy-oriented simulation-to-reality (sim2real) gap using the proposed ecosystem. In this paper, we extensively explore the modeling and simulation aspects of the ecosystem and substantiate its efficacy by demonstrating the successful transition of two candidate autonomy algorithms from simulation to reality to help support our claims: (i) autonomous parking using probabilistic robotics approach; (ii) behavioral cloning using deep imitation learning. The outcomes of these case studies further strengthen the credibility of AutoDRIVE as an invaluable tool for advancing the state-of-the-art in autonomous driving technology.

Keywords: Autonomous Vehicles; Mobile Robots; Digital Twins; Sim2Real; Real2Sim

1. INTRODUCTION

The progression of connected autonomous vehicles (CAVs) necessitates a dual approach of cutting-edge research and comprehensive education.